Composable security of CV-MDI-QKD with secret key rate and data processing

We provide a rigorous security proof of continuous-variable measurement-device-independent quantum key distribution which incorporates finite-size effects and composable terms. To use realistic, optimized parameters and derive results close to experimental expectations, we run protocol simulations supported by a Python library that covers all the protocol operations, from the simulation of the quantum communication to the extraction of the final key.

Protocol and asymptotic secret key rate. Alice and Bob prepare coherent states $|\alpha\rangle$ and $|\beta\rangle$ with amplitudes $\alpha = (Q_A + iP_A)/2$ and $\beta = (Q_B + iP_B)/2$, carried by modes $A$ and $B$ respectively. In particular, they encode the real vectorial variables $\alpha = (Q_A, P_A)$ and $\beta = (Q_B, P_B)$, which follow Gaussian distributions with variances $\sigma_A^2$ and $\sigma_B^2$ respectively. The two bosonic modes travel to an intermediate relay, where a Bell measurement is applied to them with outcome $\gamma = (Q_R + iP_R)/2$. We also use the notation $\gamma = (Q_R, P_R)$.
Eve interacts with the traveling modes via a two-mode attack in which mode $E_1$ is mixed with mode $A$ through a beam splitter with transmissivity $T_A$, and mode $E_2$ is mixed with mode $B$ through a beam splitter with transmissivity $T_B$ (see Fig. 1). The CM of Eve's modes is given by

$$V_{E_1E_2} = \begin{pmatrix} \omega_1 I & G \\ G & \omega_2 I \end{pmatrix}, \qquad G = \mathrm{diag}\{g, g'\},$$

where the bona fide conditions for $g$ and $g'$ are given in Ref. 21. These attacks are collective Gaussian two-mode attacks and represent the counterpart of the entangling-cloner attack (Refs. 35,36) for a channel comprised of two links. Given the previous description, the best attacks are those with $g < 0$ and $g' > 0$. Within this range of values, as $|g|$ and $|g'|$ become larger, the modes become more quickly and more strongly correlated (entangled). One can then choose $g_{\max} = \max\{|g|, |g'|\}$ and assume the attack with $g = -g_{\max}$ and $g' = g_{\max}$ as the worst-case scenario. In such a case, the quadratures can be treated equivalently, as they follow the same probability distribution.
The outputs $Q_R$ and $P_R$ depend on the variables $Q_A$, $P_A$ and $Q_B$, $P_B$; in particular,

$$P_R = \tau_B P_B + \tau_A P_A + P_z.$$

Figure 1. Alice and Bob send coherent states $|\alpha\rangle$ and $|\beta\rangle$ with modes $A$ and $B$ to the intermediate relay. Eve's modes $E_1$ and $E_2$ interact with the traveling modes via beam splitters with transmissivities $T_A$ and $T_B$ respectively. Eve's two-mode attack is characterized by thermal noise parameters $\omega_1$ and $\omega_2$ (see Eq. 3). Eve's modes are stored in a quantum memory waiting for an optimal measurement after the communication between the parties.

In the EB representation, one introduces additional modes $a$ and $b$ in two-mode squeezed-vacuum (TMSV) states with modes $A$ and $B$, respectively. These states have variances $\mu_A = \sigma_A^2 + 1$ and $\mu_B = \sigma_B^2 + 1$, respectively. The encoding process is then simulated by a heterodyne measurement on modes $a$ and $b$ with corresponding measurement outcomes $\alpha$ and $\beta$. The initial CM of the systems is expressed in terms of $V_{aA}(\mu_A)$ and $V_{Bb}(\mu_B)$, the CMs of TMSV states, with $Z = \mathrm{diag}\{1, -1\}$. The attack corresponds to applying a beam splitter with transmissivity $T_A$ between the modes $A$ and $E_1$ and a beam splitter with transmissivity $T_B$ between modes $B$ and $E_2$, each acting on the CM as a symplectic operation. After the beam splitters, Alice's and Bob's modes $A'$ and $B'$ are mixed in a balanced beam splitter (i.e., $T = 1/2$) and conjugate homodyne measurements are applied to the output modes, with outcomes grouped in the variable $\gamma$. In fact, starting from a CM with the general block form

$$V = \begin{pmatrix} W & C \\ C^{\mathsf T} & M \end{pmatrix},$$

where $M$ is the block of the measured mode, a homodyne measurement on that mode with outcome $x_M$ leaves the remaining modes with the CM

$$V_{|x_M} = W - C\,(\Pi M \Pi)^{-1} C^{\mathsf T},$$

with $\Pi = \mathrm{diag}\{1, 0\}$ ($\Pi = \mathrm{diag}\{0, 1\}$) for a Q(P)-measurement and $(\cdot)^{-1}$ being the pseudo-inverse operation.
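The Gaussian building blocks mentioned above (TMSV covariance matrices, beam-splitter symplectic maps and homodyne conditioning via a pseudo-inverse) can be sketched numerically as follows. This is a minimal illustration, not the authors' library: the function names are hypothetical, the beam-splitter sign convention is one common choice and may differ from the one used in the paper, and the measured mode is assumed to be the last mode of the CM.

```python
import numpy as np

I2 = np.eye(2)
Z = np.diag([1.0, -1.0])

def tmsv_cm(mu):
    """CM of a two-mode squeezed vacuum state with variance mu >= 1 (vacuum noise = 1)."""
    c = np.sqrt(mu**2 - 1.0)
    return np.block([[mu * I2, c * Z], [c * Z, mu * I2]])

def beam_splitter(T):
    """Two-mode beam-splitter symplectic matrix (assumed sign convention)."""
    t, r = np.sqrt(T), np.sqrt(1.0 - T)
    return np.block([[t * I2, r * I2], [-r * I2, t * I2]])

def homodyne_condition(V, quadrature="Q"):
    """CM of the remaining modes after homodyning the LAST mode of V."""
    W, C, M = V[:-2, :-2], V[:-2, -2:], V[-2:, -2:]
    Pi = np.diag([1.0, 0.0]) if quadrature == "Q" else np.diag([0.0, 1.0])
    # pinv implements the pseudo-inverse of the projected block
    return W - C @ np.linalg.pinv(Pi @ M @ Pi) @ C.T

# Example: Q-homodyne on one half of a TMSV state squeezes the other half.
V_cond = homodyne_condition(tmsv_cm(3.0), "Q")
S = beam_splitter(0.5)  # balanced beam splitter, as used at the relay
```

A CM transforms under a symplectic map `S` as `S @ V @ S.T`, so the relay can be emulated by composing `beam_splitter` with two calls to `homodyne_condition` once the modes are ordered so that the measured mode is last.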
In this description, one obtains the CM after the relay measurements, and Eve's information on $\beta$ is expressed in terms of conditional von Neumann entropies, $\chi(E:\beta|\gamma) = S(\rho_{E|\gamma}) - S(\rho_{E|\beta\gamma})$. By the assumption that Eve's systems $E = E_1' E_2' e$ purify the whole output state, the von Neumann entropy of the state $\rho_{E|\gamma}$ equals that of $\rho_{ab|\gamma}$, and a similar equivalence holds between $\rho_{E|\beta\gamma}$ and $\rho_{a|\beta\gamma}$. These entropies do not depend on the outcomes $\beta$ and $\gamma$ and can be expressed in terms of the symplectic eigenspectra $\{\nu_\pm\}$ and $\nu$ of the CMs $V_{ab|\gamma}$ and $V_{a|\beta\gamma}$ respectively, so that

$$\chi(E:\beta|\gamma) = h(\nu_+) + h(\nu_-) - h(\nu),$$

with

$$h(\nu) = \frac{\nu+1}{2}\log_2\frac{\nu+1}{2} - \frac{\nu-1}{2}\log_2\frac{\nu-1}{2}.$$

In terms of mutual information, the measurement variables of the EB scheme are equivalent to the rescaled P&M variables $\alpha$ and $\beta$. The conditioning on $\gamma$ is then equivalent to a displacement of the variables $\alpha$ and $\beta$, so that the key-extraction variables, $x = (Q_x, P_x)$ and $y = (Q_y, P_y)$, need to be suitably constructed. The parties construct them through relations that involve the relay outputs and two parameters, $u$ and $v$. An optimal choice for $u$ and $v$ is obtained by assuming a minimal correlation between the new variables, $x$ and $y$, and the relay outputs: Eve should learn as little as possible about $x$ and $y$ from knowing $\gamma$. Imposing this condition fixes $u$ and $v$. (These are the regression coefficients. Given a bipartition $\{x_1, x_2\}$ of a multivariate Gaussian distribution with CM $\Sigma$, with blocks $\Sigma_{11}$, $\Sigma_{12}$ and $\Sigma_{22}$, the regression coefficients are given by the matrix $\Sigma_{12}\Sigma_{22}^{-1}$.) The resulting relations involve an equality that is proven in Ref. 27, Appendix I.

The quantum mutual information between Eve's system $E = E_1' E_2' e$ and Bob's key-extraction variable $y$, when she has access to the variable $\gamma$, is given in Ref. 37, Lemma 7.4.4, and it is equal to the Holevo information $\chi(E:y|\gamma)$ since $y$ is a classical variable. In particular, given $\gamma$, there is a function $y = f(\beta)$ determined by the relations in Eqs. (31) and (32) such that $\beta = f^{-1}(y)$.
This allows us to apply the data processing inequality in both directions with respect to $y$ and $\beta$ and obtain

$$\chi(E:y|\gamma) = \chi(E:\beta|\gamma).$$

At this point, one may define the asymptotic key rate

$$R_{\text{asy}} = \zeta\, I(x:y) - \chi(E:y|\gamma),$$

which is calculated starting from the CM in Eq. (17) as in Ref. 21. Note that $\zeta$ is the reconciliation parameter defined later in Eq. (70). This parameter accounts for the proportion of mutual information given to Eve during the public-channel communication of the parties when they perform a non-ideal reconciliation process.
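As a numerical companion to the entropic quantities above, the following sketch computes symplectic eigenvalues of a CM and the bosonic entropy function, from which a Holevo term of the form "entropy of a two-mode CM minus entropy of a one-mode CM" can be assembled. It assumes vacuum noise equal to 1 and the mode ordering $(Q_1, P_1, Q_2, P_2, \ldots)$; the function names are hypothetical and the example matrices are illustrative inputs, not values from the paper.

```python
import numpy as np

def symplectic_eigenvalues(V):
    """Symplectic spectrum of a 2n x 2n CM ordered as (Q1, P1, ..., Qn, Pn)."""
    n = V.shape[0] // 2
    omega = np.kron(np.eye(n), np.array([[0.0, 1.0], [-1.0, 0.0]]))
    ev = np.abs(np.linalg.eigvals(1j * omega @ V))
    return np.sort(ev)[::2]  # eigenvalues come in +/- pairs

def h(nu):
    """Entropy (in bits) of a bosonic mode with symplectic eigenvalue nu >= 1."""
    nu = max(float(nu), 1.0)  # guard against rounding slightly below vacuum
    a, b = (nu + 1.0) / 2.0, (nu - 1.0) / 2.0
    return a * np.log2(a) - (b * np.log2(b) if b > 0 else 0.0)

def holevo_from_cms(V_two_mode, V_one_mode):
    """Sum of h over the two-mode spectrum minus h of the one-mode spectrum."""
    total = sum(h(nu) for nu in symplectic_eigenvalues(V_two_mode))
    cond = sum(h(nu) for nu in symplectic_eigenvalues(V_one_mode))
    return total - cond

# Illustrative check: a thermal mode with variance 3 carries h(3) = 2 bits.
nu_thermal = symplectic_eigenvalues(np.diag([3.0, 3.0]))
```

In the protocol the two inputs would be the conditional CMs $V_{ab|\gamma}$ and $V_{a|\beta\gamma}$; here they are left as arguments because their explicit form depends on the attack parameters.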

Parameter estimation.
Here we follow the PE proposed in Ref. 33. An alternative way, based on extra simplifying assumptions, is described in Ref. 27, Appendix II.B. In particular, based on $m$ samples indexed by $i = 1, \ldots, m$, the parties calculate the maximum likelihood estimators (MLEs) of the covariances. From these, they define estimators for $T_A$ and $T_B$, and then an estimator for $\sigma_z^2$. Note here that $m = N - n$, where $N$ is the number of signals sent through the channel and $n$ is the number of signals devoted to secret key extraction for each block. In a practical situation, where the transmission can be assumed stable over a large number of blocks $n_{\text{bks}}$, one can use $m$ signals on average from each block in order to estimate the channel parameters. Thus the parties sacrifice $M = m\,n_{\text{bks}}$ signals for PE, and the corresponding rate is evaluated from these estimators.

The mutual information and the correlation between the two variables $x$ and $y$ are connected through Ref. 38, Eq. (8.56) (see also Ref. 27, Eq. (2)); for a pair of jointly Gaussian scalar variables with correlation coefficient $\rho$, this relation reads $I = -\tfrac{1}{2}\log_2(1 - \rho^2)$. One may derive the estimator for the correlation between the variables by replacing the transmissivities and noise with their MLEs in the mutual information. This estimator helps in the calculation of the a priori probabilities for the initialization step of the sum-product decoding algorithm of the error correction step (Refs. 31,32).
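The two ingredients of this step, a maximum likelihood covariance estimate and the conversion between correlation and mutual information, can be sketched as below. The sketch assumes zero-mean Gaussian samples (so the MLE of a covariance is the plain sample average of products) and scalar variables; the function names are hypothetical and the specific estimators for $T_A$, $T_B$ and $\sigma_z^2$ used in the paper are not reproduced here.

```python
import numpy as np

def mle_covariance(u, v):
    """MLE of Cov(u, v) for zero-mean samples: (1/m) * sum_i u_i v_i."""
    u, v = np.asarray(u, dtype=float), np.asarray(v, dtype=float)
    return float(np.mean(u * v))

def correlation_estimate(u, v):
    """Correlation coefficient built from the three MLE covariances."""
    return mle_covariance(u, v) / np.sqrt(mle_covariance(u, u) * mle_covariance(v, v))

def mutual_information_from_correlation(rho):
    """I = -(1/2) log2(1 - rho^2) for jointly Gaussian scalar variables."""
    return -0.5 * np.log2(1.0 - rho**2)

def correlation_from_mutual_information(I):
    """Inverse of the relation above (non-negative branch)."""
    return np.sqrt(1.0 - 2.0 ** (-2.0 * I))

# Example with synthetic data: y is a noisy copy of x.
rng = np.random.default_rng(0)
x = rng.normal(0.0, 1.0, 100_000)
y = x + rng.normal(0.0, 0.5, x.size)
rho_hat = correlation_estimate(x, y)
I_hat = mutual_information_from_correlation(rho_hat)
```

The inverse map is the direction used when a mutual information computed from estimated channel parameters has to be turned into a correlation for initializing a decoder.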

Data reconciliation. The parties apply the transformations of Eqs. (29)-(32) based on the quantities in Eqs. (34)-(37), calculated via the MLEs of the previous section. Alice and Bob then combine their data from the Q and P quadratures into one variable, through a mapping that yields $2n$ samples from each block. Afterwards, the parties apply the EC procedure using non-binary LDPC codes following Ref. 32, Sect. III.B (for extra details see also Ref. 31). More specifically, they define the worst-case estimator (up to an error probability $\epsilon_{\text{ent}}$) for the reconciliation parameter $\zeta$ appearing in Eq. (43). This estimator involves the following quantities: the worst-case entropy of the raw-key string described by $l$, the normalized and discretized version of $y$; $\hat{H}(l)$, the estimator of that entropy; $-R_{\text{code}}\,q + p$, the maximum amount of data exchanged for reconciliation per channel use when one uses a non-binary LDPC code with rate $R_{\text{code}}$ associated with the Galois field $GF(2^q)$ and a discretization of $p$ bits; and $I(x:y)$ evaluated at the estimators of $T_A$, $T_B$ and the noise, i.e., the ideal mutual information between the parties according to the data (after parameter estimation), which appears in Eq. (63). We take into consideration here that Bob applies the LDPC encoding only to $q$ bits of $l$, while the remaining $p - q$ bits are sent entirely through the public channel.
By replacing $\zeta$ in the previous equation, one obtains the practical key rate $R^{\text{EC}}_M$ of Eq. (71). The parties started with two different sequences of $n_{\text{bks}}$ blocks, each with $2N$ initial samples. In the process (after PE and EC), these are reduced to two indistinguishable binary sequences (with probability $1 - \epsilon_{\text{EC}}$) that consist of $p_{\text{EC}}\,n_{\text{bks}}$ blocks, each carrying $2np$ bits. These binary strings are built from three pieces: the part of the original variable $l$ that has been sent through the public channel using the LDPC encoding, the part that has been sent unchanged through the public channel, and the part that has been successfully decoded and verified with probability $p_{\text{EC}}$. Concatenating the previous parts appropriately, the parties end up with the raw data sequences in binary form, $l_{\text{bin}}$ for Bob and $\tilde{l}_{\text{bin}}$ for Alice. The parties need to apply the appropriate amount of compression to these sequences during the PA step so that the raw-data strings become a secret key. This amount is determined by the composable key rate calculated below.
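The bit accounting of the reconciliation step can be illustrated with a small sketch. It assumes that each discretized symbol is a $p$-bit integer, that the $q$ most significant bits are the ones protected by the non-binary LDPC code and the remaining $p - q$ bits are disclosed in the clear (which bits are encoded is an assumption made here for illustration), and that the function names are hypothetical.

```python
def split_symbol(l, p, q):
    """Split a p-bit symbol into (top q bits, bottom p - q bits)."""
    if not 0 <= l < (1 << p):
        raise ValueError("symbol does not fit in p bits")
    d = p - q
    top = l >> d                # q-bit part, treated as an element of GF(2^q)
    bottom = l & ((1 << d) - 1)  # part sent unchanged over the public channel
    return top, bottom

def leakage_bits_per_use(R_code, p, q):
    """Maximum data exchanged for reconciliation per channel use: p - R_code * q."""
    return p - R_code * q

def join_symbol(top, bottom, p, q):
    """Inverse of split_symbol, used after the top part has been decoded."""
    return (top << (p - q)) | bottom

# Example: p = 5 discretization bits, q = 2 of them LDPC-encoded.
top, bottom = split_symbol(0b10110, p=5, q=2)
leak = leakage_bits_per_use(R_code=0.875, p=7, q=4)
```

The leakage expression matches the term $-R_{\text{code}}\,q + p$ quoted in the text: a syndrome of a rate-$R_{\text{code}}$ code over $GF(2^q)$ costs $(1 - R_{\text{code}})q$ bits per symbol, and the $p - q$ unencoded bits are added on top.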
Composable security. We adapt the composable security analysis presented in Ref. 17, Appendix G to the requirements of the CV-MDI-QKD protocol. More specifically, the secret key is characterized by certain properties stemming from the post-processing procedures, and there is an overall probability $\epsilon$ that the key fails to possess at least one of these properties.
According to the previous analysis, one may bound the length of the secret key as in Ref. 17, Eq. (G12), where $l$ is defined through a bidirectional mapping of $l_Q$ and $l_P$, the instances of $l$ corresponding to the Q and P quadratures. Note that here we have used a virtual concatenation assumption (see Appendix A of Ref. 32) to pass from a description based on the single-quadrature variable $l$ (normalized and discretized) to one based on the vectorial variable $l$. One may also observe that, in this case, Eve's system is described by the group of modes $E$ plus the classical variable $\gamma$. In particular, $H(l|E\gamma)$ is the conditional von Neumann entropy of the variable $l$ conditioned on $E$ and $\gamma$, and (Ref. 39, Eq. (61))

$$H(l|E\gamma) = H(l|\gamma) - I(l:E|\gamma).$$

This mutual information is between a classical variable $l$ and a quantum system $E$ (conditioned on another classical variable $\gamma$). It is therefore the Holevo information $\chi(E:l|\gamma)$, i.e., an upper bound on the accessible information on $l$ given that Eve possesses $E$ (and knows the variable $\gamma$). Therefore, by reversing Eq. (76), one may write

$$H(l|E\gamma) = H(l) - \chi(E:l|\gamma),$$

where $H(l|\gamma) = H(l)$ (see Eq. 33) is the Shannon entropy of $l$. In more detail, since $l$ is a discretized function of $y$, the data processing inequality bounds Eve's Holevo information as

$$\chi(E:l|\gamma) \le \chi(E:y|\gamma).$$

Therefore we have

$$H(l|E\gamma) \ge H(l) - \chi(E:y|\gamma).$$

We may replace Eq. (79) in Eq. (73) and, in this way, derive a bound that includes the asymptotic secret key rate of Eq. (43). One may then replace $R_{\text{asy}}$ with $R^{\text{EC}}_M$ of Eq. (71) in Eq. (81) to obtain the composable key rate of Eq. (82) together with its composable terms (see also Refs. 17,32,39). The overall security parameter is given in Eq. (85), where the factor 3 is due to the fact that $\epsilon_{\text{PE}}$ is defined per parameter.
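Two of the composable quantities can be evaluated directly. The sketch below assumes the standard form of the AEP penalty, $\Delta_{\text{AEP}}(\epsilon_s, |L|) = 4\log_2(\sqrt{|L|} + 2)\sqrt{\log_2(2/\epsilon_s^2)}$, i.e., with square roots over $|L|$ and over the second logarithm, and the overall security parameter $\epsilon = \epsilon_{\text{cor}} + \epsilon_h + \epsilon_s + p_{\text{EC}}(3\epsilon_{\text{PE}} + \epsilon_{\text{ent}})$. The function names and the numerical values in the example are hypothetical.

```python
import math

def delta_aep(eps_s, alphabet_size):
    """AEP penalty, assuming 4*log2(sqrt(|L|)+2)*sqrt(log2(2/eps_s^2))."""
    return (4.0 * math.log2(math.sqrt(alphabet_size) + 2.0)
            * math.sqrt(math.log2(2.0 / eps_s**2)))

def total_epsilon(eps_cor, eps_h, eps_s, eps_pe, eps_ent, p_ec):
    """Overall security parameter; eps_pe enters three times (once per parameter)."""
    return eps_cor + eps_h + eps_s + p_ec * (3.0 * eps_pe + eps_ent)

# Illustrative values only: 7-bit discretization per quadrature gives |L| = 2**14
# for the two-quadrature variable (an assumption made for this example).
penalty = delta_aep(eps_s=1e-10, alphabet_size=2**14)
eps = total_epsilon(1e-10, 1e-10, 1e-10, 1e-10, 1e-10, p_ec=0.9)
```

In a finite-size rate, a penalty of this kind is divided by the square root of the number of key-generation signals, so it shrinks as the block size grows while the alphabet size only enters logarithmically.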
One may also derive an approximate key rate that is not based on the data postprocessing. It is built from the rate $R_M$ of Eq. (63), but with the estimators approximated using the initial values of the simulation (see, e.g., the steps in Sects. III.B.1 and III.B.2 in Ref. 31). In other words, one defines $R_M$ from Eq. (63) after replacing the estimators by these approximations.

The composable terms include (Eq. 75)

$$\Delta_{\text{AEP}}(\epsilon_s, |L|) = 4\log_2\!\left(\sqrt{|L|} + 2\right)\sqrt{\log_2(2/\epsilon_s^2)},$$

and the overall security parameter reads (Eq. 85)

$$\epsilon = \epsilon_{\text{cor}} + \epsilon_h + \epsilon_s + p_{\text{EC}}(3\epsilon_{\text{PE}} + \epsilon_{\text{ent}}).$$

Privacy amplification. Now the parties are ready to apply the appropriate amount of compression indicated by Eq. (82) to their binary strings in Eq. (72) to create a secret key through the PA step. To achieve this, they compress them via universal hashing. More specifically, they apply a modified Toeplitz matrix $G = (I_r \,|\, T_{r,2np-r})$ to their sequences in order to extract the secret key (Ref. 40), where $r = 2npR$, the Toeplitz matrix $T_{r,2np-r}$ has dimensions $r \times (2np - r)$ and $I_r$ is the $r \times r$ identity matrix. We denote by $l^{r}_{\text{bin}}$ the first $r$ bits of the raw key string and by $l^{2np-r}_{\text{bin}}$ the rest.
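Hashing with a modified Toeplitz matrix $(I_r \,|\, T)$ can be sketched as follows. Because of the block structure, multiplying the raw string by this matrix over GF(2) amounts to adding the first $r$ bits to the Toeplitz hash of the remaining bits. The function name is hypothetical, the seed layout (one bit per diagonal of $T$) is one possible convention, and a practical implementation for long strings would typically use an FFT-based product instead of building the matrix explicitly.

```python
import numpy as np

def modified_toeplitz_hash(bits, r, seed):
    """Compress a bit string to r bits with the matrix (I_r | T) over GF(2).

    bits : raw key, length L
    r    : output length
    seed : L - 1 random bits defining the r x (L - r) Toeplitz matrix T
    """
    bits = np.asarray(bits, dtype=np.int64)
    seed = np.asarray(seed, dtype=np.int64)
    cols = bits.size - r
    if seed.size != r + cols - 1:
        raise ValueError("seed must contain L - 1 bits")
    i = np.arange(r)[:, None]
    j = np.arange(cols)[None, :]
    T = seed[i - j + cols - 1]  # constant along each diagonal (Toeplitz)
    return (bits[:r] + T @ bits[r:]) % 2

# Example: compress 5 raw bits to r = 2 key bits.
key = modified_toeplitz_hash([1, 0, 0, 1, 1], r=2, seed=[1, 0, 1, 1])
```

The identity block makes the map cheaper to apply than a full $r \times 2np$ Toeplitz matrix, since only the last $2np - r$ bits pass through the matrix product.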

Simulation and results
In our simulations, the attack is handled by initially defining values for the excess noise of Alice's ($\xi_A$) and Bob's ($\xi_B$) channels. These values, along with the transmissivity of each channel, determine the thermal noise $\omega$ of the respective channel. Using Alice's thermal noise value, we can estimate the correlation parameter $g$ from Eq. (11). We then have all the components needed to find the excess-noise variance shown in Eq. (10). Finally, the noise variance $\sigma_z^2$ is calculated through Eq. (9).
The parameters used to execute the simulations are listed in Table 2. To begin with, the symmetric version of the protocol is examined, which means that the signal variance and the channel parameters are the same for Alice and Bob, i.e., $\mu_A = \mu_B$, $T_A = T_B$ and $\xi_A = \xi_B$.
To find a signal variance range for which the composable rate $R$ becomes positive, the asymptotic rate $R_{\text{asy}}$ was maximized using a modulation variance optimization function. Table 1 shows that a positive $R$ can be achieved when $45 \le \mu_A, \mu_B \le 50$. Under these conditions, the SNR spans from approximately 10 to 11.89. As presented in Table 1, the choice of the reconciliation efficiency is important when trying to maximize the value of $R$. It is important to note that neither the asymptotic nor the composable rate keeps growing as the signal variances increase: at some point the rates saturate and eventually become negative again.
Knowing the values for which the composable rate becomes positive, we can now identify the maximum tolerable excess noise in the system. For this purpose, $\mu_A = \mu_B = 46$ is chosen in order to produce a high rate (and therefore tolerate more excess noise) while performing a faster EC procedure (compared to that for $\mu_A = \mu_B = 49$). Therefore, in Fig. 2, the symmetric case of the protocol is considered again, with $\mu_A = \mu_B = 46$ and with variable excess noise. As observed from the plot, $\xi$ can take values up to 0.008 before the protocol is deemed unsafe for key distribution.
Next, we investigate the asymmetric version of the protocol, where the channel parameters, as well as the signal variances, are different between Alice and Bob. Here, two cases are examined: Fig. 3 shows the behaviour of Alice's transmissivity against the composable key rate and Fig. 4 displays the maximum tolerable values for Alice's excess noise. Regarding the former case, it is possible for Alice's channel to reach transmissivity values of (91)

Conclusion
In this study, we give a rigorous proof for the composable security of the Gaussian-modulated CV-MDI protocol and we calculate its rate. Depending on this rate, the appropriate amount of compression is applied in order to extract a secret key. We simulate the quantum communication step and we apply all the classical postprocessing steps to the generated data. All of these procedures are performed by means of an associated Python library.

Table 2. Simulation parameters, with value columns for Fig. 2, Fig. 3 and Fig. 4. Other parameters are taken as in Table 2.